Purposes To develop and validate a predictive scoring system for therapy-failure in persons with chronic phase chronic myeloid leukemia (CML) receiving an initial second-generation tyrosine kinase inhibitor (2G-TKI).

Methods Data from 329 consecutive subjects with CML receiving initial 2G-TKI therapy at one centre (Beijing, China) were interrogated as a training dataset to develop a predictive model, which was then validated in 1,083 subjects from 65 other centres in China. Therapy-failure was defined using the 2020 European LeukemiaNet (ELN) criteria, including progression to accelerated or blast phase or CML-related death. A Fine-Gray model was used for uni- and multi-variable analyses (UVA and MVA) to identify co-variates significantly associated with therapy-failure. The Akaike information criterion (AIC) was used to select prognostic co-variates and determine the best model. One thousand bootstrap re-samplings, the Fine-Gray test, the minimal p-value approach, Bonferroni correction and a kernel density estimator were used to determine optimal cut-offs to classify subjects into risk cohorts. Time-dependent receiver-operator characteristic (ROC) curves were used to estimate prediction accuracy, and calibration plots were used to determine concordance between predicted and observed cumulative incidences. Decision curve analysis (DCA) was used to calculate the net benefit of using the model.

Results In the training dataset, 279 subjects (85%) initially received nilotinib and 50 (15%) dasatinib. Median 2G-TKI therapy duration was 36 months (interquartile range [IQR], 24-68 months). Seventy-five subjects (23%) had therapy-failure at a median of 9 months (IQR, 3-12 months) after therapy start. In MVA, increasing splenomegaly and a higher percentage of blood blasts were significantly associated with a higher cumulative incidence of therapy-failure. We used these data to develop a predictive scoring system: "2G-TKI failure risk score = 0.6405 × [(spleen size below the costal margin in cm + 0.2)/10] + 0.1348 × (blood blasts + 0.2)", which divided subjects into low- (score ≤ 0.3021; n = 89; 31%), intermediate- (0.3021 < score < 1.4508; n = 155; 54%) and high-risk (score ≥ 1.4508; n = 43; 15%) cohorts with 5-year cumulative incidences of therapy-failure of 3% (95% confidence interval [CI], 0, 5%), 26% (20, 32%) and 72% (56, 88%; p-value for trend < 0.001; Figure 1A). Hazard ratios (HRs) for therapy-failure (low-risk cohort as reference) were 4.8 (1.7, 13.6; p = 0.002) and 16.5 (5.7, 47.6; p < 0.001; p for trend < 0.001) for the intermediate- and high-risk cohorts, respectively. In the external validation dataset, subjects were classified into low- (n = 433; 40%), intermediate- (n = 583; 54%) and high-risk (n = 67; 6%) cohorts using our scoring system. The corresponding 5-year cumulative incidences of therapy-failure were 8% (3, 13%), 18% (12, 25%) and 44% (29, 60%; p for trend < 0.001; Figure 1B). HRs (low-risk cohort as reference) were 2.7 (1.9, 3.6; p < 0.001) and 10.6 (5.8, 15.4; p < 0.001; p for trend < 0.001) for the intermediate- and high-risk cohorts, respectively. Time-dependent areas under the ROC curve (AUROCs) for therapy-failure at 1, 3 and 5 years were 0.73-0.82 in the training dataset and 0.68-0.75 in the validation dataset. Calibration plots for the 1-, 3- and 5-year cumulative incidences of therapy-failure indicated good concordance between predicted and observed outcomes. DCA curves indicated a net benefit from using the predictive scoring system in clinical settings.
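The scoring formula and cut-offs reported in the Results can be illustrated with a minimal sketch. The function and constant names below are hypothetical and not part of the abstract, and the sketch assumes spleen size is entered in cm below the costal margin and blood blasts as a percentage (for example, 2 for 2%), which the abstract implies but does not state explicitly.

```python
# Illustrative sketch of the reported 2G-TKI failure risk score.
# Names are hypothetical; units are assumptions (spleen in cm below the
# costal margin, blood blasts as a percentage).

LOW_RISK_MAX = 0.3021    # score <= this value: low-risk cohort
HIGH_RISK_MIN = 1.4508   # score >= this value: high-risk cohort


def tki_failure_risk_score(spleen_cm: float, blood_blasts_pct: float) -> float:
    """Compute the score using the coefficients quoted in the Results."""
    if spleen_cm < 0 or blood_blasts_pct < 0:
        raise ValueError("spleen size and blood blasts must be non-negative")
    spleen_term = 0.6405 * ((spleen_cm + 0.2) / 10)
    blast_term = 0.1348 * (blood_blasts_pct + 0.2)
    return spleen_term + blast_term


def risk_cohort(score: float) -> str:
    """Map a score to the low-, intermediate- or high-risk cohort."""
    if score <= LOW_RISK_MAX:
        return "low"
    if score >= HIGH_RISK_MIN:
        return "high"
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical subject: spleen 10 cm below the costal margin, 2% blood blasts.
    score = tki_failure_risk_score(10, 2)
    print(round(score, 4), risk_cohort(score))
```

Under these assumptions, a subject with no palpable spleen and no blood blasts falls in the low-risk cohort, because the two 0.2 offsets alone contribute well under the 0.3021 cut-off.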

Conclusions We developed and externally validated a therapy-failure scoring system in persons with chronic phase CML receiving an initial 2G-TKI. The model had AUROCs of 0.73-0.82 in the training dataset and 0.68-0.75 in the validation dataset and performed well in calibration plots and DCAs. Our model may help physicians decide on the appropriateness of initial 2G-TKI therapy, including possible selection of appropriate haematopoietic cell transplant candidates.

No relevant conflicts of interest to declare.

